Neuro Based Approach for Speech Recognition by Using Mel-frequency Cepstral Coefficients

Authors

  • R.L.K. Venkateswarlu
  • R. Vasanthakumari
Abstract

R.L.K. Venkateswarlu (1) and R. Vasanthakumari (2). (1) Department of Information Technology, Sasi Institute of Technology and Engineering, Tadepalligudem, India. (2) Perunthalaivar Kamarajar Arts College, Puducherry-605107, India.

This paper presents a continuous speech recognition system based on neural networks. Features are extracted, and the data compressed, using the Mel-frequency cepstral coefficient (MFCC) method. These coefficients are used as inputs to train the neural networks. Neural networks are useful for solving complex problems that do not require an exact solution. The backpropagation algorithm is used to train a multilayer perceptron, and the solution found by this approach is convergent. The research work is aimed at speech recognition using multilayer perceptron neural networks. A small vocabulary of 11 words was established first: upload, search, browse, import, export, send, remove, attach, help, format, and install. The chosen words correspond to computer functions such as: export a file or an image; find a file, a folder, or an image; view the data; download some properties; transfer out of a database or document in a given format; move a file; delete a file; add a file; get assistance; delete existing content; and add new software. The words are introduced to the computer and then subjected to feature extraction using Mel-frequency cepstral coefficients. These features are used as input to an artificial neural network in speaker-dependent mode. Half of the words are used for training the artificial neural network and the other half for testing the system. The system consists of three parts: speech processing, feature extraction, and training and testing by using neural networks and information retrieval. The retrieval process proved to be between 81.44% and 93.18% successful, which is quite acceptable considering the variation in surroundings, the state of the speaker, and the microphone type.
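The feature-extraction step named in the abstract can be illustrated with a minimal sketch of MFCC computation for a single speech frame. This is not the authors' implementation: the abstract does not give the frame length, sampling rate, number of mel filters, or number of coefficients, so the values below (256 samples, 8000 Hz, 20 filters, 13 coefficients) and all function names are assumptions chosen for illustration. The sketch uses only the Python standard library and a naive DFT, which is slow but keeps the steps visible: window, power spectrum, mel filterbank, logarithm, cosine transform.

```python
import cmath
import math


def hz_to_mel(hz):
    # Common mel-scale formula (one of several conventions in use).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Apply a Hamming window and return the one-sided power spectrum."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    mel_points = [top * i / (num_filters + 1) for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    bank = []
    for j in range(1, num_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):      # rising edge
            row[k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling edge, peak at centre
            row[k] = (right - k) / (right - centre)
        bank.append(row)
    return bank


def mfcc(frame, sample_rate, num_filters=20, num_coeffs=13):
    """Return MFCCs for one frame: log mel energies followed by a DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # Floor the energies so the logarithm is always defined.
    log_energies = [math.log(max(sum(w * p for w, p in zip(row, spec)), 1e-10))
                    for row in bank]
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / num_filters)
                for m, e in enumerate(log_energies))
            for c in range(num_coeffs)]


# Usage: a synthetic 440 Hz tone stands in for one frame of recorded speech.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(256)]
coeffs = mfcc(tone, sample_rate)
print([round(c, 2) for c in coeffs[:3]])
```

In a system like the one described, a vector of this kind would be computed for every frame of each spoken word, and the resulting vectors would form the input to the multilayer perceptron. A production system would normally use an FFT rather than the naive DFT shown here.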

Similar resources

Voice-based Age and Gender Recognition using Training Generative Sparse Model

Abstract: Gender recognition and age detection are important problems in telephone speech processing to investigate the identity of an individual using voice characteristics. In this paper a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and atom correction post-processing method. Similar to genera...


Acoustic Emotion Recognition Using Linear and Nonlinear Cepstral Coefficients

Recognizing human emotions through the vocal channel has gained increased attention recently. In this paper, we study how the features and classifiers used affect the recognition accuracy of emotions present in speech. Four emotional states are considered for classification of emotions from speech in this work. For this aim, features are extracted from audio characteristics of emotional speech using Linea...


Using Mel-Frequency Cepstral Coefficients in Missing Data Technique

Filter bank is the most common feature being employed in the research of the marginalisation approaches for robust speech recognition due to its simplicity in detecting the unreliable data in the frequency domain. In this paper, we propose a hybrid approach based on the marginalisation and the soft decision techniques that make use of the Mel-frequency cepstral coefficients (MFCCs) instead of f...


Speaker Recognition System Based On MFCC and DCT

This paper examines and presents an approach to the recognition of speech signals using frequency spectral information on the Mel frequency scale, which is a dominant feature for speech recognition. Mel-frequency cepstral coefficients (MFCCs) are the coefficients that collectively represent the short-term power spectrum of a sound, based on a linear cosine transform of a log power spectrum on a nonlinear m...


Hardware Implementation of Speech Recognition Using MFCC and Euclidean Distance

This paper suggests a digital signal processor (DSP) based speech recognition system with improved performance in terms of recognition accuracy and computational cost. It gives a comprehensive survey of various approaches to feature extraction, such as Mel filter banks with Mel Frequency Cepstrum Coefficients (MFCC). This paper describes an approach of isolated speech recognition by Digital Signal Process...


Publication date: 2011